[OpenVINO] Support Qwen3.5, Qwen3.5-MoE and Qwen3.6 #1689
Co-authored-by: Roman Kazantsev <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Thanks for adding support! I've been attempting to test this locally. However, there is an issue with the generated models and/or the OpenVINO implementation for Qwen.

I've exported copies of

The 9B appears to work fine (apart from enabling/disabling thinking, but that's a different issue). The other two, though, are causing major issues.

First, neither of them will load to GPU. When I attempt to load

When I attempt to do the same with

Second, the models are generating gibberish. I'm able to get them to load on my CPU, but the response doesn't make any sense:

I can submit an issue, provide more information, and/or move this discussion to openvinotoolkit/openvino.genai if necessary.
Hi @droans,

Thanks for reporting this. Regarding the CPU issue, you are probably using the latest OpenVINO nightly build, which has a regression. We are waiting for this PR to be merged: openvinotoolkit/openvino#35640

Regarding the GPU issue, this is a problem on the GPU side. Could you please create a GitHub issue at https://github.com/huggingface/optimum-intel/issues and provide reproducers using the optimum-intel API?

Best regards,
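For anyone preparing such a reproducer, here is a minimal sketch of a per-device load check. It is not taken from this PR: `probe_devices` and `load_with_optimum` are hypothetical helper names, and the choice of `OVModelForCausalLM` is an assumption (a multimodal export may need `OVModelForVisualCausalLM` instead). The loader is passed in as a callable so the reporting logic can be checked without OpenVINO installed.

```python
from typing import Callable, Dict, Iterable


def probe_devices(
    load: Callable[[str], object],
    devices: Iterable[str] = ("CPU", "GPU"),
) -> Dict[str, str]:
    """Try to load a model on each device; record "ok" or the error text."""
    report: Dict[str, str] = {}
    for device in devices:
        try:
            load(device)
            report[device] = "ok"
        except Exception as exc:  # keep going so every device gets reported
            report[device] = f"{type(exc).__name__}: {exc}"
    return report


def load_with_optimum(model_dir: str, device: str):
    """Hypothetical loader; the model class used here is an assumption."""
    # Imported lazily so probe_devices can be used without optimum-intel.
    from optimum.intel import OVModelForCausalLM

    return OVModelForCausalLM.from_pretrained(model_dir, device=device)
```

A possible call, assuming the exported model lives in a local directory named `model_dir`: `print(probe_devices(lambda d: load_with_optimum(model_dir, d)))`. Pasting the resulting dictionary into the issue would show which devices fail and with what error.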
## Description

This PR enables Qwen3.5 model in VLM pipeline (only SDPA use case), updates tests and documentation. Requires huggingface/optimum-intel#1689 for model export.

Current WWB accuracy results:

```
Optimum vs HF
INFO:whowhatbench.wwb:Metrics for model: models/qwen3_5_0_8b_fp16
INFO:whowhatbench.wwb:   similarity
0    0.990854
```

```
GenAI vs Optimum (default vision preprocessing)
INFO:whowhatbench.wwb:Metrics for model: models/qwen3_5_0_8b_fp16
INFO:whowhatbench.wwb:   similarity
0    0.939989
```

```
GenAI vs Optimum (VISION_PREPROCESS=CPP)
INFO:whowhatbench.wwb:Metrics for model: models/qwen3_5_0_8b_fp16
INFO:whowhatbench.wwb:   similarity
0    0.959576
```

CVS-181273

## Checklist:

- [x] This PR follows [GenAI Contributing guidelines](https://github.com/openvinotoolkit/openvino.genai?tab=contributing-ov-file#contributing).
- [x] Tests have been updated or added to cover the new code.
- [x] This PR fully addresses the ticket.
- [x] I have made corresponding changes to the documentation.

---------

Co-authored-by: Copilot <copilot@github.com>
What does this PR do?
Re-created PR #1634
Fixes 181271, 181280, 182003
Installation instructions:
Exporting cmd-line:
optimum-cli export openvino -m Qwen/Qwen3.5-0.8B Qwen3.5-0.8B
Inference script:
Before submitting